SHP2 as a primordial epigenetic enzyme expunges histone H3 pTyr-54 to amend androgen receptor homeostasis

Mutations that decrease or increase the activity of the tyrosine phosphatase SHP2 (encoded by PTPN11) promote developmental disorders and several malignancies. We uncovered that SHP2 is a distinct class of epigenetic enzyme; upon phosphorylation by the kinase ACK1/TNK2, pSHP2 was escorted by androgen receptor (AR) to chromatin, erasing hitherto unidentified pY54-H3 (phosphorylation of histone H3 at Tyr54) epigenetic marks to trigger a transcriptional program of AR. Noonan Syndrome with Multiple Lentigines (NSML) patients, SHP2 knock-in mice, and ACK1 knockout mice presented a dramatic increase in pY54-H3, leading to loss of the AR transcriptome. In contrast, prostate tumors with high pSHP2 and pACK1 activity exhibited progressive downregulation of pY54-H3 levels and higher AR expression that correlated with disease severity. Overall, pSHP2/pY54-H3 signaling acts as a sentinel of AR homeostasis, explaining not only growth retardation, genital abnormalities and infertility among NSML patients, but also significant AR upregulation in prostate cancer patients.


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided (only common tests should be described solely by name; describe more complex techniques in the Methods section)
A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted (give P values as exact values whenever suitable)
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Data analysis
ImageJ (National Institutes of Health): https://imagej.nih.gov/ij/
Adobe Photoshop Version 24.x (Adobe): https://www.adobe.com/products/photoshop.html
Adobe Illustrator Version 26.4 (Adobe): https://www.adobe.com/products/illustrator.html
ChIP-seq Analysis Software: R/Bioconductor package and MACS peak-finding software; WashU Epigenome Browser; GSEA analysis

Mass Spectrometry
Cells were lysed in denaturing buffer containing 8 M urea, 20 mM HEPES (pH 8), 1 mM sodium orthovanadate, 2.5 mM sodium pyrophosphate and 1 mM β-glycerophosphate. Bradford assays determined the protein concentration for each sample. Protein disulfides were reduced with 4.5 mM DTT at 60 °C for 30 minutes and then cysteines were alkylated with 10 mM iodoacetamide for 20 minutes in the dark at room temperature. Trypsin digestion was carried out at room temperature overnight with an enzyme-to-substrate ratio of 1:20, and tryptic peptides were acidified with aqueous 1% trifluoroacetic acid (TFA) and desalted with C18 Sep-Pak cartridges according to the manufacturer's procedure. Following lyophilization, peptide pellets were re-dissolved in immunoaffinity purification (IAP) buffer containing 50 mM MOPS pH 7.2, 10 mM sodium phosphate and 50 mM NaCl. Phosphotyrosine-containing peptides (pY) were immunoprecipitated with p-Tyr-1000 beads (Cell Signaling Technology #8803S).
A nanoflow ultra high-performance liquid chromatograph and nanoelectrospray orbitrap mass spectrometer (RSLCnano and Q Exactive plus, Thermo) were used for LC-MS/MS. The sample was loaded onto a pre-column (C18 PepMap100, 2 cm length x 100 μm ID packed with C18 reversed-phase resin, 5 μm particle size, 100 Å pore size) and washed for 8 minutes with aqueous 2% acetonitrile and 0.1% formic acid. Trapped peptides were eluted onto the analytical column (C18 PepMap100, 25 cm length x 75 μm ID, 2 μm particle size, 100 Å pore size, Thermo). A 120-minute gradient was programmed as: 95% solvent A (aqueous 2% acetonitrile + 0.1% formic acid) for 8 minutes, solvent B (aqueous 90% acetonitrile + 0.1% formic acid) from 5% to 38.5% in 90 minutes, then solvent B from 50% to 90% B in 7 minutes and held at 90% for 5 minutes, followed by solvent B from 90% to 5% in 1 minute and re-equilibration for 10 minutes using a flow rate of 300 nl/min. Spray voltage was 1900 V. Capillary temperature was 275 °C. S lens RF level was set at 40. The top 16 tandem mass spectra were collected in a data-dependent manner. The resolutions for MS and MS/MS were set at 70,000 and 17,500, respectively. Dynamic exclusion was 15 seconds for previously sampled peaks.

Data Analysis: MaxQuant (version 1.2.2.5) was used to identify peptides using the UniProt human database and quantify the MS1 precursor intensities. Up to 2 missed trypsin cleavages were allowed. The mass tolerance was 20 ppm for the first search and 4.5 ppm for the main search. Carbamidomethyl cysteine was set as a fixed modification. Phosphorylation on Serine/Threonine/Tyrosine and Methionine oxidation were set as variable modifications. Both peptide spectral match (PSM) and protein false discovery rate (FDR) were set at 0.05. The match between runs feature was activated to carry identifications across samples. For data upload to PRIDE/ProteomeXchange, similar database searches were performed with Mascot (www.matrixscience.com) in Proteome Discoverer (Thermo).
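To make the ppm tolerances concrete, the following is a minimal sketch of how a tolerance in parts per million translates into an m/z window. It is not MaxQuant's code; the function names and the example m/z value are hypothetical, and only the 20 ppm and 4.5 ppm settings come from the text above.

```python
# Illustrative only: converts a ppm tolerance into an absolute m/z window.
# Function names and the example m/z are assumptions, not MaxQuant internals.

FIRST_SEARCH_PPM = 20.0   # first-search tolerance stated in the methods
MAIN_SEARCH_PPM = 4.5     # main-search tolerance stated in the methods


def ppm_window(mz: float, tol_ppm: float) -> tuple:
    """Return the (low, high) m/z bounds around mz for a ppm tolerance."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def within_tolerance(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the observed m/z deviates from theory by at most tol_ppm."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


if __name__ == "__main__":
    example_mz = 800.0  # hypothetical precursor m/z
    for label, tol in (("first search", FIRST_SEARCH_PPM), ("main search", MAIN_SEARCH_PPM)):
        low, high = ppm_window(example_mz, tol)
        print(f"{label}: {tol} ppm -> {low:.4f} to {high:.4f}")
```

The window scales with m/z, so the same ppm setting admits a wider absolute deviation for heavier precursors.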
Quantitative RT-PCR
Cells under various experimental conditions were used for RNA isolation and cDNA preparation as described earlier (ref. 48). All RT reactions were done at the same time so that the same reactions could be used for all gene studies. For the construction of standard curves, serial dilutions of pooled sample RNA were used (50, 10, 2, 0.4, 0.08, and 0.016 ng) per reverse transcriptase reaction. One "no RNA" control and one "no Reverse Transcriptase" control were included for the standard curve. Three reactions were performed for each sample: 10 ng, 0.8 ng, and a NoRT (10 ng) control. Real-time quantitative PCR analyses were performed using the ABI PRISM 7900HT Sequence Detection System (Applied Biosystems). All standards, the no template control (H2O), the No RNA control, the no Reverse Transcriptase control, and the no amplification control (Bluescript plasmid) were tested in six wells per gene (2 wells/plate x 3 plates/gene). All samples were tested in triplicate wells each for the 10 ng and 0.8 ng concentrations. PCR was carried out with SYBR® Premix Ex Taq™ II TB green premix (TaKaRa, RR82LR) using 2 μl of cDNA and the primers in a 20 μl final reaction mixture. After a 2 min incubation at 50 °C, the reaction was activated by a 10 min incubation at 95 °C, followed by 40 PCR cycles consisting of 15 s of denaturation at 95 °C and hybridization of primers for 1 min at 55 °C. Dissociation curves were generated for each plate to verify the integrity of the primers. Data were analyzed using StepOne and StepOnePlus software version 2.3 and exported into an Excel spreadsheet. The actin or GAPDH data were used for normalizing the gene values, i.e., ng gene/ng Actin or GAPDH per well.
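The standard-curve quantification and reference-gene normalization described above can be sketched as follows. This is an assumed, generic implementation (ordinary least squares of Ct against log10 of input RNA), not the instrument software's algorithm; the function names are hypothetical and any Ct values supplied to it would be the user's own measurements.

```python
import math

# Dilution series (ng RNA per RT reaction) taken from the methods text.
STANDARD_NG = [50, 10, 2, 0.4, 0.08, 0.016]


def fit_standard_curve(ng, ct):
    """Least-squares fit of Ct = slope * log10(ng) + intercept."""
    x = [math.log10(v) for v in ng]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ct_to_ng(ct, slope, intercept):
    """Interpolate an RNA-equivalent quantity (ng) from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(gene_ct, ref_ct, gene_curve, ref_curve):
    """ng of target gene per ng of reference gene (actin or GAPDH) for one well."""
    return ct_to_ng(gene_ct, *gene_curve) / ct_to_ng(ref_ct, *ref_curve)
```

Each gene gets its own curve because amplification efficiency, and therefore slope, can differ between primer pairs.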

Computational analysis of ChIP-Seq data: Sequence analysis
The 75-nt paired-end sequence reads were mapped to the genome using the BWA-MEM algorithm. Alignment information for each read was stored in the output file *.bam. Only reads that mapped uniquely with proper pairing were used in the subsequent analysis.
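One way to express the "uniquely mapped with proper pairing" filter is through SAM flag bits and mapping quality, sketched below on SAM-format text lines. The source does not state how uniqueness was defined; treating MAPQ of at least 1 and the absence of secondary or supplementary flags as "unique" is an assumption here, and the function names are hypothetical.

```python
# SAM flag bits as defined in the SAM format specification.
PROPER_PAIR = 0x2
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def keep_read(flag: int, mapq: int, min_mapq: int = 1) -> bool:
    """Keep primary, mapped, properly paired alignments with MAPQ >= min_mapq.

    The MAPQ threshold is an assumed proxy for unique mapping.
    """
    if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
        return False
    if not flag & PROPER_PAIR:
        return False
    return mapq >= min_mapq


def filter_sam_lines(lines, min_mapq: int = 1):
    """Yield header lines unchanged and alignment lines that pass keep_read."""
    for line in lines:
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if keep_read(flag, mapq, min_mapq):
            yield line
```

In practice the same criteria are usually applied directly to the BAM file with a dedicated tool; the sketch only spells out the logic.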

Determination of fragment density
Because the 5ʹ ends of the sequence reads represent the end of the ChIP or immunoprecipitation fragments, the reads were extended in silico (using MACS2) at their 3ʹ ends to a length of 173-244 bp, based on the fragment length calculated from the read pairs.

Peak finding
Peak regions were called using the MACS2 software with the following options: -f BAMPE --SPMR -q 0.01 --broad. The "BAMPE" option was used for calculating fragment lengths from the paired-end reads, "SPMR" for normalizing read depths to number of fragments per million reads, "broad" for compositing broad regions from nearby peak regions, and the q-value (FDR) cutoff was set to 0.01.
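The peak-calling options listed above could be assembled into a MACS2 invocation roughly as follows. This is a sketch, not the authors' exact command: the file names, the sample name, and the use of a control BAM are hypothetical, and -B is added only because MACS2 documents --SPMR as requiring it.

```python
import shlex


def macs2_callpeak_args(treatment_bam, name, control_bam=None, outdir="."):
    """Build an argument list for `macs2 callpeak` with the stated options.

    treatment_bam, control_bam, name and outdir are placeholders to be
    supplied by the user; they are not taken from the source.
    """
    args = [
        "macs2", "callpeak",
        "-t", treatment_bam,
        "-f", "BAMPE",      # derive fragment lengths from read pairs
        "-B", "--SPMR",     # pileup tracks scaled to signal per million reads
        "-q", "0.01",       # q-value (FDR) cutoff
        "--broad",          # composite nearby peaks into broad regions
        "-n", name,
        "--outdir", outdir,
    ]
    if control_bam is not None:
        args += ["-c", control_bam]
    return args


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(shlex.join(macs2_callpeak_args("chip.bam", "sample1", control_bam="input.bam")))
```

Returning a list rather than a shell string lets the same arguments be passed to `subprocess.run` without quoting problems.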

Tissue Microarray (TMA) Analysis
The prostate TMAs were obtained from US Biomax (PR807c: 80 cases) for this study, which is exempt from IRB approval because no personal information about patients is sought. The tissue array slides (including positive and negative controls) were stained with the indicated antibodies. The slides were dewaxed by heating at 65 °C for 60 min and washed two times, 15 min each, with xylene. Tissues were rehydrated by two series of 10 min washes in 100%, 95%, and 70% ethanol and distilled water. Antigen retrieval was performed by heating the samples at 95 °C for 25 min in 10 mmol/L sodium citrate (pH 6.0). The slides were cooled in PBS for 30 min, with 10 min changes of PBS, and permeabilized using 0.2% Triton X-100 in PBS for 10 min. Slides were washed with PBS + 0.2% Tween-20 for 10 min. After blocking with universal blocking serum (DAKO Diagnostic, Mississauga, Ontario, Canada) for 30 min, the samples were incubated with rabbit monoclonal pY580-SHP2 (1:300 dilution) and rabbit polyclonal pY54-H3 antibody (1:300 dilution) at 4 °C overnight. The sections were incubated with biotin-labeled secondary antibody and streptavidin-peroxidase for 30 min each (DAKO Diagnostic). The samples were developed with 3,3ʹ-diaminobenzidine substrate (Vector Laboratories, Burlington, Ontario, Canada) and counterstained with hematoxylin. Following standard procedures, the slides were dehydrated and sealed with cover slips. The pY-SHP2 and pY54-H3 staining was examined in a blinded fashion by a pathologist (C.W.). The positive reactions were scored into four grades according to the intensity of staining: 0, 1+, 2+ and 3+.
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information.

April 2023
Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.